Thumb-2 Poly1305: implementation in assembly #7939
This needs to be merged first: #7935
Force-pushed from 3b77224 to eb76034.
Implementation of ChaCha algorithm for ARM Thumb-2.
Force-pushed from eb76034 to e5ead65.
"LDR r3, [%[key], #4]\n\t"
"LDR r4, [%[key], #8]\n\t"
"LDR r5, [%[key], #12]\n\t"
"LDM r10, {r6, r7, r8, r9}\n\t"
This is a Cortex-M7 (STM32H753ZI). This was with a debug build:
arm-none-eabi-gcc "../Middlewares/Third_Party/wolfSSL_wolfSSL_wolfSSL/wolfssl/wolfcrypt/src/port/arm/thumb2-poly1305-asm_c.c" -mcpu=cortex-m7 -std=gnu11 -g3 -DUSE_HAL_DRIVER -DHAVE_PKCS11_STATIC -DSTM32H753xx -DDEBUG -c -I../Middlewares/Third_Party/FreeRTOS/Source/include -I"/Users/davidgarske/Projects/WolfSSL/STM/STM32H7/STM32H753/Middlewares/Third_Party/wolfPKCS11" -I../Middlewares/Third_Party/FreeRTOS/Source/portable/GCC/ARM_CM4F -I../Drivers/CMSIS/Include -I../Core/Inc -I../Drivers/STM32H7xx_HAL_Driver/Inc/Legacy -I../Drivers/CMSIS/Device/ST/STM32H7xx/Include -I../Middlewares/Third_Party/FreeRTOS/Source/CMSIS_RTOS_V2 -I../Drivers/STM32H7xx_HAL_Driver/Inc -I../Middlewares/Third_Party/wolfSSL_wolfSSL_wolfSSL/wolfssl -I../wolfSSL/. -I../wolfSSL -I../Middlewares/Third_Party/wolfSSL_wolfSSL_wolfSSL/wolfssl/ -I../Middlewares/Third_Party/wolfSSL_wolfSSH_wolfSSH/wolfssh/ -I../Middlewares/Third_Party/wolfSSL_wolfMQTT_wolfMQTT/wolfmqtt/ -I../wolfTPM -I../Middlewares/Third_Party/wolfSSL_wolfTPM_wolfTPM/wolftpm/ -O0 -ffunction-sections -fdata-sections -Wall -fomit-frame-pointer -fstack-usage -fcyclomatic-complexity -MMD -MP -MF"Middlewares/Third_Party/wolfSSL_wolfSSL_wolfSSL/wolfssl/wolfcrypt/src/port/arm/thumb2-poly1305-asm_c.d" -MT"Middlewares/Third_Party/wolfSSL_wolfSSL_wolfSSL/wolfssl/wolfcrypt/src/port/arm/thumb2-poly1305-asm_c.o" --specs=nano.specs -mfpu=fpv5-d16 -mfloat-abi=hard -mthumb -o "Middlewares/Third_Party/wolfSSL_wolfSSL_wolfSSL/wolfssl/wolfcrypt/src/port/arm/thumb2-poly1305-asm_c.o"
Fixed.
Implementation of ChaCha algorithm for ARM Thumb-2. Implementation of Poly1305 algorithm for ARM Thumb-2.
Force-pushed from e5ead65 to 27033c2.
retest this please
Wonderful improvement!
On an STM32H7 Cortex-M7 at 480MHz:
CHACHA: 50% faster
CHA-POLY: 85% faster
POLY1305: 177% faster (2.77x the original throughput)
Before:
CHACHA 6 MiB took 1.000 seconds, 6.177 MiB/s
CHA-POLY 3 MiB took 1.004 seconds, 3.404 MiB/s
POLY1305 12 MiB took 1.000 seconds, 12.207 MiB/s
After:
CHACHA 9 MiB took 1.000 seconds, 9.204 MiB/s
CHA-POLY 6 MiB took 1.000 seconds, 6.299 MiB/s
POLY1305 34 MiB took 1.000 seconds, 33.789 MiB/s
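For anyone re-running the benchmark, here is a minimal sketch of how the percentages relate to the MiB/s figures above. The function names are hypothetical and not part of wolfSSL. Note that "percent faster" is the gain over the original throughput, (after / before - 1) * 100, which differs from the throughput expressed as a percentage of the original, (after / before) * 100.

```c
#include <stdio.h>

/* Hypothetical helper: how many times the original throughput the new code reaches. */
double speedup_ratio(double before_mib_s, double after_mib_s)
{
    return after_mib_s / before_mib_s;
}

/* Hypothetical helper: relative gain over the original throughput, in percent. */
double percent_faster(double before_mib_s, double after_mib_s)
{
    return (speedup_ratio(before_mib_s, after_mib_s) - 1.0) * 100.0;
}

/* Print both views for each benchmark line quoted in this comment. */
void print_speedups(void)
{
    static const struct {
        const char* name;
        double before;  /* MiB/s before the change */
        double after;   /* MiB/s after the change */
    } rows[] = {
        { "CHACHA",    6.177,  9.204 },
        { "CHA-POLY",  3.404,  6.299 },
        { "POLY1305", 12.207, 33.789 },
    };
    size_t i;

    for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
        printf("%-9s %.2fx, %.0f%% faster\n", rows[i].name,
               speedup_ratio(rows[i].before, rows[i].after),
               percent_faster(rows[i].before, rows[i].after));
    }
}
```

Calling `print_speedups()` from a test harness prints one line per algorithm with both the ratio and the percentage gain.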
Description
Implementation of Poly1305 algorithm for ARM Thumb-2.
Testing
Tested with QEMU.
With and without: -DWOLFSSL_SP_NO_UMAAL
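For context on why both configurations are worth testing: UMAAL is the ARM "unsigned multiply accumulate accumulate long" instruction, which computes RdHi:RdLo = Rn * Rm + RdLo + RdHi in one step. It is part of the DSP extension, so some Thumb-2 cores (Cortex-M3, for example) lack it, and a build with -DWOLFSSL_SP_NO_UMAAL presumably has to reach the same result with other instructions. The sketch below models the instruction in portable C and shows one possible multiply-then-add-with-carry equivalent. Both function names are hypothetical, and the fallback is an illustration rather than the sequence this PR actually emits.

```c
#include <stdint.h>

/* Reference model of UMAAL RdLo, RdHi, Rn, Rm:
 *   RdHi:RdLo = Rn * Rm + RdLo + RdHi
 * The sum cannot overflow 64 bits:
 *   (2^32 - 1)^2 + 2 * (2^32 - 1) = 2^64 - 1. */
void umaal_ref(uint32_t* lo, uint32_t* hi, uint32_t n, uint32_t m)
{
    uint64_t t = (uint64_t)n * m + *lo + *hi;

    *lo = (uint32_t)t;
    *hi = (uint32_t)(t >> 32);
}

/* Illustrative equivalent without UMAAL: a 32x32->64 multiply (as UMULL
 * would produce) followed by two 32-bit additions with carry propagated
 * into the high word (as ADDS/ADC would do). */
void umaal_fallback(uint32_t* lo, uint32_t* hi, uint32_t n, uint32_t m)
{
    uint64_t p  = (uint64_t)n * m;
    uint32_t pl = (uint32_t)p;
    uint32_t ph = (uint32_t)(p >> 32);
    uint32_t s;

    s = pl + *lo;          /* add old low word */
    ph += (s < pl);        /* carry out of the low word */
    pl = s;

    s = pl + *hi;          /* add old high word into the low word */
    ph += (s < pl);        /* carry out of the low word */

    *lo = s;
    *hi = ph;
}
```

The two functions should agree for every input, including the all-ones corner case, which is a useful check when comparing the two build variants.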
Checklist